Lecture 4: Biological enrichment calculations and GO/GSEA

04/06/2021

General Enrichment Calculation: Applications

  • splicing
  • enrichment of SNPs in epigenomics data
  • allele specific expression (ASE)

Any problem involving count data in which the underlying probability is unknown but a suitable “background” condition is available for comparison.
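
A minimal sketch of one such count-based calculation, assuming the comparison is framed as drawing a list of items from the background without replacement. This is a one-sided hypergeometric test, which is a common choice but not necessarily the method used in this lecture. The function name and the counts in the usage lines are made up for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(k, n, K, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: size of the background (e.g. all genes tested)
    K: background items carrying the annotation of interest
    n: size of the selected list (e.g. differentially expressed genes)
    k: selected items carrying the annotation
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("counts are inconsistent")
    total = comb(N, n)  # number of ways to pick the list from the background
    upper = min(n, K)
    # Sum the probability of seeing k, k+1, ..., upper annotated items.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 20000 background genes, 200 annotated,
# 500 selected, 15 of the selected are annotated.
p_value = hypergeom_enrichment_pvalue(k=15, n=500, K=200, N=20000)
fold = (15 / 500) / (200 / 20000)  # observed fraction over background fraction
print(f"fold enrichment = {fold:.2f}, p = {p_value:.3g}")
```

The background plays the role of the unknown probability: the annotated fraction K/N is what the selected list is compared against.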

Enrichment in ranked lists

Online methods

How GSEA Works

shamelessly stolen from: Hector Corrada Bravo
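
The figures for these slides are not reproduced here. As a rough sketch of the core idea, assuming the running-sum statistic of the published GSEA method: walk down the ranked gene list, step up at each gene in the set (weighted by its ranking score) and step down at each gene outside it, then take the largest deviation from zero as the enrichment score. The function name, the toy genes, and the scores are made up for illustration.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene names sorted from most to least associated
    scores: ranking metric for each gene, in the same order
    gene_set: collection of gene names to test
    p: weight exponent (p=0 gives an unweighted walk)
    Returns (enrichment score, running-sum profile).
    """
    members = set(gene_set)
    hits = [g in members for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    # Each hit steps up in proportion to |score|^p; hit steps sum to 1.
    hit_weights = [abs(s) ** p if h else 0.0 for s, h in zip(scores, hits)]
    norm = sum(hit_weights)
    if norm == 0:
        raise ValueError("all hit weights are zero")

    running, best, profile = 0.0, 0.0, []
    for w, h in zip(hit_weights, hits):
        # Each miss steps down by 1 / n_miss; miss steps sum to 1.
        running += w / norm if h else -1.0 / n_miss
        profile.append(running)
        if abs(running) > abs(best):
            best = running  # keep the signed maximum deviation from zero
    return best, profile


# Toy example: the set sits at the top of the ranking.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.5, 1.0, -0.5, -1.5, -2.0]
es, walk = enrichment_score(genes, metric, {"g1", "g2"})
print(es, walk)
```

Significance is usually assessed by recomputing the score under permutation and comparing the observed value with that null distribution; that step is omitted from this sketch.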

Ontologies

“In computer science and information science, an ontology encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities that substantiate one, many or all domains of discourse. More simply, an ontology is a way of showing the properties of a subject area and how they are related, by defining a set of concepts and categories that represent the subject.”

(from Wikipedia)

Gene Ontology (GO) is a curated graph of terms, organized into three sub-ontologies:

  • Molecular Function (e.g. “adenylate cyclase activity”)
  • Cellular Component (e.g. “ribosome”)
  • Biological Process (e.g. “DNA repair”)

Other Useful Ontologies

Reactome is an expert-authored, peer-reviewed knowledgebase of reactions and pathways.

  • Manually curated human pathways with experimental evidence (regarded as the highest quality)
  • Pathways for other organisms (e.g. Gallus gallus, Mus musculus) inferred from the curated human pathways

Reactome is useful when you want to…

  • Know the molecular details of a pathway based on the literature (e.g. a directed pathway)
  • Learn about crosstalk between pathways (e.g. shared genes/reactions)

Navigating Reactome

  • The Reactome website provides an easy way to access, browse, analyse and download pathway data
  • Pathway browser
  • Pathway structure

MSigDB (Molecular Signatures Database)

  • Hallmark gene sets
  • Canonical pathways
  • Regulatory target gene sets
  • Disease gene sets
  • Many cancer gene sets
  • Gene Ontology
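
MSigDB collections can be downloaded as GMT files: tab-delimited text with one gene set per line, giving the set name, a description field, and then the member genes. A minimal sketch of reading such a file into a dictionary follows; the function name `parse_gmt` and the two example lines are made up for illustration and are not real MSigDB sets.

```python
import io


def parse_gmt(handle):
    """Read GMT-formatted lines into {set_name: set_of_genes}."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}  # drop empty trailing fields
    return gene_sets


# Hypothetical two-line GMT content, read from memory instead of a file.
example = io.StringIO(
    "TOY_SET_A\ttoy description\tGENE1\tGENE2\tGENE3\n"
    "TOY_SET_B\ttoy description\tGENE3\tGENE4\n"
)
sets = parse_gmt(example)
print(sorted(sets["TOY_SET_A"] & sets["TOY_SET_B"]))  # genes shared by both sets
```

With a real download, the same function could be called on an open file handle, e.g. `with open(path) as fh: parse_gmt(fh)`.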

Finally: A word about the construction and limitations of all ontologies

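  • a gene can be annotated to several terms, and an annotation to a term also implies annotation to that term's ancestors in the graph
  • the usefulness of GO terms varies greatly throughout the graph: terms near the root are too broad to be informative, while the most specific terms annotate very few genes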
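
Both points can be seen on a toy graph. The sketch below assumes annotations propagate from a term to all of its ancestors (the convention GO calls the true path rule). The term names, edges, and genes are made up for illustration and are not taken from a GO release.

```python
# Hypothetical mini-ontology: child term -> parent terms.
# It is a DAG, so a term may have more than one parent.
PARENTS = {
    "process": [],
    "metabolism": ["process"],
    "stress response": ["process"],
    "DNA metabolism": ["metabolism"],
    "DNA repair": ["DNA metabolism", "stress response"],
}

# Hypothetical direct annotations: gene -> most specific terms.
DIRECT = {
    "geneA": {"DNA repair"},
    "geneB": {"DNA metabolism"},
    "geneC": {"stress response"},
}


def ancestors(term, parents=PARENTS):
    """Collect every term reachable by following parent edges."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def propagate(direct, parents=PARENTS):
    """Extend each gene's annotations with all ancestor terms."""
    full = {}
    for gene, terms in direct.items():
        expanded = set(terms)
        for term in terms:
            expanded |= ancestors(term, parents)
        full[gene] = expanded
    return full


def genes_per_term(full):
    """Count how many genes end up annotated to each term."""
    counts = {}
    for terms in full.values():
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    return counts


full = propagate(DIRECT)
print(sorted(full["geneA"]))  # one direct annotation, several implied terms
print(genes_per_term(full))   # the root term covers every gene
```

In this toy graph the root term is attached to every gene and so cannot distinguish any gene list from the background, which is one reason term usefulness depends on position in the graph.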